action consistency
Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving
Jin, Guizhe, Li, Zhuoren, Leng, Bo, Yu, Ran, Xiong, Lu, Sun, Chen
Reinforcement Learning (RL) is increasingly used in autonomous driving (AD) and shows clear advantages. However, most RL-based AD methods overlook policy structure design. An RL policy that outputs only short-timescale vehicle control commands produces fluctuating driving behavior because the network outputs themselves fluctuate, while one that outputs only long-timescale driving goals cannot achieve unified optimality of driving behavior and control. Therefore, we propose a multi-timescale hierarchical reinforcement learning approach. Our approach adopts a hierarchical policy structure in which high- and low-level RL policies are trained jointly to produce long-timescale motion guidance and short-timescale control commands, respectively. The motion guidance is explicitly represented by hybrid actions to capture multimodal driving behaviors on structured roads and to support incremental low-level extend-state updates. Additionally, a hierarchical safety mechanism is designed to ensure multi-timescale safety. Evaluation in simulator-based and HighD dataset-based highway multi-lane scenarios demonstrates that our approach significantly improves AD performance, effectively increasing driving efficiency, action consistency and safety.
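The two-timescale structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the policies are hand-written stand-ins rather than trained networks, the hybrid action is assumed to be a discrete target lane plus a continuous target speed, and every name (`Guidance`, `high_level_policy`, `low_level_policy`, `guidance_period`) is hypothetical. The only point it shows is that guidance is refreshed every few steps while control commands are issued at every step.

```python
import random
from dataclasses import dataclass


@dataclass
class Guidance:
    """Assumed hybrid action: one discrete and one continuous component."""
    target_lane: int      # discrete part (lane index)
    target_speed: float   # continuous part (m/s)


def high_level_policy(state, rng):
    """Hypothetical stand-in for a learned long-timescale policy."""
    current_lane = round(state["lateral_pos"])
    lane = min(2, max(0, current_lane + rng.choice([-1, 0, 1])))
    return Guidance(target_lane=lane, target_speed=rng.uniform(20.0, 30.0))


def low_level_policy(state, guidance):
    """Hypothetical stand-in for a learned short-timescale policy.

    Returns (steer, accel), each clipped to [-1, 1].
    """
    steer = 0.5 * (guidance.target_lane - state["lateral_pos"])
    accel = 0.2 * (guidance.target_speed - state["speed"])
    clip = lambda x: max(-1.0, min(1.0, x))
    return clip(steer), clip(accel)


def rollout(num_steps=20, guidance_period=5, seed=0):
    rng = random.Random(seed)
    state = {"lateral_pos": 1.0, "speed": 25.0}
    guidance, log = None, []
    for t in range(num_steps):
        # Long timescale: refresh guidance only every `guidance_period` steps.
        if t % guidance_period == 0:
            guidance = high_level_policy(state, rng)
        # Short timescale: issue a control command at every step.
        steer, accel = low_level_policy(state, guidance)
        state["lateral_pos"] += 0.1 * steer   # toy dynamics, illustration only
        state["speed"] += 0.5 * accel
        log.append((t, guidance, steer, accel))
    return log
```

Calling `rollout(20, 5)` yields 20 control commands driven by 4 guidance updates. Holding the guidance fixed between updates is what keeps the low-level commands pointed at a stable target in this toy setting.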
HYDRA: Hybrid Robot Actions for Imitation Learning
Belkhale, Suneel, Cui, Yuchen, Sadigh, Dorsa
Imitation Learning (IL) is a sample-efficient paradigm for robot learning using expert demonstrations. However, policies learned through IL suffer from state distribution shift at test time, due to compounding errors in action prediction which lead to previously unseen states. Choosing an action representation for the policy that minimizes this distribution shift is critical in imitation learning. Prior work proposes using temporal action abstractions to reduce compounding errors, but these often sacrifice policy dexterity or require domain-specific knowledge. To address these trade-offs, we introduce HYDRA, a method that leverages a hybrid action space with two levels of action abstractions: sparse high-level waypoints and dense low-level actions. HYDRA dynamically switches between action abstractions at test time to enable both coarse and fine-grained control of a robot. In addition, HYDRA employs action relabeling to increase the consistency of actions in the dataset, further reducing distribution shift. HYDRA outperforms prior imitation learning methods by 30-40% on seven challenging simulation and real-world environments, involving long-horizon tasks in the real world like making coffee and toasting bread. Videos are available on our website: https://tinyurl.com/3mc6793z
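The test-time switching between sparse waypoints and dense actions can be sketched on a toy 2D point robot. This is an illustration under stated assumptions, not HYDRA itself: the policy is a hand-written rule rather than a learned network, the distance threshold, `GOAL`, and the linear waypoint controller are all hypothetical, and action relabeling is not shown.

```python
import numpy as np

GOAL = np.array([1.0, 0.5])  # hypothetical task target


def policy(pos):
    """Hypothetical stand-in for a learned policy.

    Returns ("waypoint", target_position) when far from the goal and
    ("dense", displacement) when close to it.
    """
    offset = GOAL - pos
    dist = np.linalg.norm(offset)
    if dist > 0.2:
        # Coarse mode: one sparse waypoint that stops 0.15 short of the goal.
        return "waypoint", pos + offset * (1.0 - 0.15 / dist)
    # Fine mode: a small per-step displacement.
    return "dense", 0.5 * offset


def reach_waypoint(pos, waypoint, max_step=0.1):
    """Simple controller: move in a straight line with bounded step length."""
    while np.linalg.norm(waypoint - pos) > 1e-6:
        delta = waypoint - pos
        pos = pos + delta * min(1.0, max_step / np.linalg.norm(delta))
    return pos


def run_episode(start, max_queries=50, tol=1e-2):
    pos = np.asarray(start, dtype=float)
    modes = []
    for _ in range(max_queries):
        if np.linalg.norm(GOAL - pos) < tol:
            break
        mode, output = policy(pos)
        modes.append(mode)
        if mode == "waypoint":
            pos = reach_waypoint(pos, output)  # many control steps, one prediction
        else:
            pos = pos + output                 # one control step per prediction
    return pos, modes
```

In waypoint mode a single policy query covers many controller steps, so fewer predictions are made over the long free-space segment; dense mode is reserved for the final approach, where fine-grained control matters.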
Probabilistic Loss and its Online Characterization for Simplified Decision Making Under Uncertainty
Zhitnikov, Andrey, Indelman, Vadim
Easing the computational burden of the decision-making process is a long-standing objective, and identifying how sensitive this process is to simplification has far-reaching ramifications. Yet algorithms for decision making under uncertainty usually rely on approximations or heuristics without quantifying their effect, so challenging scenarios can severely impair their performance. In this paper, we extend the decision-making mechanism in full by removing standard approximations and accounting for all previously suppressed stochastic sources of variability. On top of this extension, our key contribution is a novel framework that simplifies decision making while assessing and controlling the simplification's impact online. Furthermore, we present novel stochastic bounds on the return and use this framework to characterize online the effect of one particular simplification technique: reducing the number of samples in the belief representation for planning. Finally, we verify the advantages of our approach through extensive simulations.
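The simplification named in this abstract, using fewer samples to represent the belief, can be illustrated with a small sketch. It does not reproduce the paper's stochastic bounds: a standard Hoeffding interval for rewards bounded in [0, 1] is swapped in as a generic stand-in, and the reward function, the Gaussian belief, and all names are hypothetical.

```python
import math
import random


def reward(state, action):
    """Hypothetical reward bounded in [0, 1]: closeness of state to action."""
    return max(0.0, 1.0 - abs(state - action))


def expected_reward(particles, action):
    """Sample-mean estimate of the expected reward under a particle belief."""
    return sum(reward(s, action) for s in particles) / len(particles)


def hoeffding_halfwidth(n, delta):
    """Half-width t such that the mean of n samples in [0, 1] deviates from
    its expectation by more than t with probability at most delta."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def compare(num_full=2000, num_reduced=100, action=0.3, delta=0.05, seed=0):
    rng = random.Random(seed)
    belief = [rng.gauss(0.0, 0.5) for _ in range(num_full)]   # full belief
    reduced = rng.sample(belief, num_reduced)                 # simplified belief
    full_est = expected_reward(belief, action)
    reduced_est = expected_reward(reduced, action)
    return full_est, reduced_est, hoeffding_halfwidth(num_reduced, delta)
```

`compare()` returns the estimate from the full particle set, the estimate from the reduced set, and the half-width of the interval that bounds their gap with probability at least `1 - delta`. A planner could, for instance, accept the cheaper estimate only when that half-width is smaller than the difference between competing actions; that decision rule is an illustration and is not taken from the paper.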